Time evolution of the extremely diluted Blume-Emery-Griffiths neural network

The time evolution of the extremely diluted Blume-Emery-Griffiths neural network model is studied, and a detailed equilibrium phase diagram is obtained exhibiting pattern retrieval, fluctuation retrieval and self-sustained activity phases. It is shown that saddle-point solutions associated with fluctuation overlaps slow down considerably the flow of the network states towards the retrieval fixed points. A comparison of the performance with other three-state networks is also presented.

I. INTRODUCTION

There has been much interest in understanding the properties and predicting the behavior of large attractor neural networks. Of primary concern are the storage capacity, the ability and quality of retrieval, and the information transmitted by the network. The study of attractor neural networks has often been guided by the search for networks of optimal performance.

It has been argued recently that mutual information is the most appropriate concept to measure the performance quality, especially in sparsely coded networks. An attempt has been made to infer the Hamiltonian, or energy function, of an optimally performing three-state network from the structure of the initial mutual information, and a disordered Blume-Emery-Griffiths (BEG) network model has been obtained. This has been used to derive the specific properties that characterize the performance of an extremely diluted network. The arguments leading to the BEG Hamiltonian and the dynamical behavior in networks of other architectures have been studied recently.

A characteristic feature of the model is to store and retrieve patterns and some of their fluctuations, giving rise, in the thermodynamic limit, to two independent local random fields. In contrast to the usual three-state network, both fields are self-adjusting functions and the network does not need an externally adjustable threshold parameter to activate the neuron states.

One of the further interesting aspects of the model is the presence of an independent order parameter macroscopically characterizing these fluctuations that could yield a new information-carrying phase. However, neither the time evolution of this order parameter nor the stability of such a phase has been studied before. The purpose of the present paper is precisely to investigate these points in the extremely diluted network, which has an exactly solvable dynamics. This also allows one to determine the size of the basins of attraction, a point that has not been emphasized before, and that can only be studied to some extent in the diluted network due to the complexity of the interactions in the underlying model.

The motivation for our work is the discovery of moderately and, eventually, very long transients in the dynamic evolution of some of the states of the network and the recognition that these states often drive the network into a retrieval phase. It also turned out that these states are not stable in the absence of synaptic noise (temperature T), in contrast to an earlier claim, but that a finite, activity-dependent threshold value of T is required for these states to stabilize. We also produce a further comparison of the performance of the network with other three-state networks. Preliminary results of this work have been presented recently.

The outline of the paper is the following. In Sec. II we briefly summarize the model and refer the interested reader to previous works for further discussion. In Sec. III we state the macrodynamics and we present our results in Sec. IV, ending with our conclusions in Sec. V.



II. THE MODEL 

We consider a three-state network with symmetrically distributed neuron states $\sigma_{i,t} = 0, \pm 1$ on sites $i = 1, \ldots, N$, at time step $t$, where $\sigma_{i,t} = \pm 1$ denote the active states. A set of $p$ ternary patterns, $\{\xi_i^\mu = 0, \pm 1\}$, $\mu = 1, \ldots, p$, where $\xi_i^\mu = \pm 1$ are the active ones, are assumed to be independent random variables following the probability distribution

$$ p(\xi_i^\mu) = a\,\delta(|\xi_i^\mu|^2 - 1) + (1 - a)\,\delta(\xi_i^\mu), \tag{1} $$



where the average $a = \langle (\xi_i^\mu)^2 \rangle$ is the activity of the patterns. These patterns are embedded in the network by means of a generalized learning rule together with a set $\{\eta_i^\mu\}$ of normalized fluctuations, $\eta_i^\mu = ((\xi_i^\mu)^2 - a)/[a(1 - a)]$, of the binary patterns $(\xi_i^\mu)^2$ about their average.
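As a minimal numerical sketch of the pattern statistics just described (the function names, the pattern count and the system size are illustrative assumptions, not taken from the paper), one can sample ternary patterns from Eq. (1) and check that the normalized fluctuations have zero mean and second moment $1/[a(1-a)]$:

```python
import numpy as np

def sample_patterns(p, N, a, rng):
    """Ternary patterns: +1 or -1 with probability a/2 each, 0 with probability 1 - a."""
    active = rng.random((p, N)) < a           # which sites are active
    signs = rng.choice([-1, 1], size=(p, N))  # symmetric signs for the active sites
    return active * signs

def fluctuations(xi, a):
    """Normalized fluctuations eta = (xi^2 - a) / (a (1 - a)) of the binary patterns xi^2."""
    return (xi**2 - a) / (a * (1.0 - a))

rng = np.random.default_rng(0)
a = 0.8                                       # pattern activity (value used in the figures)
xi = sample_patterns(p=20, N=50_000, a=a, rng=rng)
eta = fluctuations(xi, a)

print("empirical activity <xi^2>:", (xi**2).mean())
print("<eta>:", eta.mean(), " <eta^2>:", (eta**2).mean(), " 1/(a(1-a)):", 1 / (a * (1 - a)))
```

Each $\eta_i^\mu$ takes only two values, $1/a$ on active sites and $-1/(1-a)$ on inactive ones, which is why it can be regarded as a (rescaled) binary pattern.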
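The learning rule consists of two Hebbian-like parts

$$ J_{ij} = \frac{1}{a^2 N} \sum_{\mu=1}^{p} \xi_i^\mu \xi_j^\mu, \qquad K_{ij} = \frac{1}{N} \sum_{\mu=1}^{p} \eta_i^\mu \eta_j^\mu, \tag{2} $$

which are the random interactions in the bilinear BEG network with the dynamic variables $\sigma_{i,t}$ and $\sigma_{i,t}^2$, respectively. These are then used to construct the random local fields

$$ h_{i,t} = \sum_{j \neq i}^{N} J_{ij}\,\sigma_{j,t}, \qquad \theta_{i,t} = \sum_{j \neq i}^{N} K_{ij}\,\sigma_{j,t}^2, \tag{3} $$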



where the first one is the usual local field for a three-state network. This enables one to obtain an effective single-site energy function for neuron $i$, in mean-field theory,

$$ \epsilon_{i,t} = -\left[ h_{i,t}\,\sigma_{i,t} + \theta_{i,t}\,\sigma_{i,t}^2 \right], \tag{4} $$

which rules the state-flip probability

$$ p(\sigma_{i,t+1} \,|\, \{\sigma_{i,t}\}) = \exp(-\beta \epsilon_{i,t}) / Z_t, \tag{5} $$

that specifies the parallel stochastic dynamics for the model, where $Z_t = 1 + 2 e^{\beta \theta_t} \cosh(\beta h_t)$ and $\beta = a/T$ is the inverse temperature (noise) parameter.
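The learning rule, the local fields and the stochastic update can be put together in a short simulation sketch. The following is an illustration only, under stated assumptions: it uses a small, fully connected network rather than the extremely diluted architecture studied here, it removes self-couplings, and the sizes, the seed and the helper name `heat_bath_step` are hypothetical choices.

```python
import numpy as np

def heat_bath_step(sigma, J, K, beta, rng):
    """One parallel stochastic update of all neurons according to Eq. (5)."""
    h = J @ sigma          # field coupled to sigma_i
    theta = K @ sigma**2   # field coupled to sigma_i^2
    # Boltzmann log-weights -beta * eps for the states (-1, 0, +1)
    logw = beta * np.stack([theta - h, np.zeros_like(h), theta + h], axis=1)
    logw -= logw.max(axis=1, keepdims=True)      # avoid overflow
    prob = np.exp(logw)
    prob /= prob.sum(axis=1, keepdims=True)      # the row sum plays the role of Z_t
    # Inverse-CDF sampling of one of the three states per neuron
    u = rng.random(sigma.size)
    idx = (u[:, None] > np.cumsum(prob, axis=1)).sum(axis=1)
    return np.minimum(idx, 2) - 1, prob, h, theta

rng = np.random.default_rng(1)
N, p, a, T = 500, 3, 0.8, 0.2                    # illustrative sizes (assumption)
beta = a / T
xi = (rng.random((p, N)) < a) * rng.choice([-1, 1], size=(p, N))
eta = (xi**2 - a) / (a * (1 - a))
J = xi.T @ xi / (a**2 * N)                       # Eq. (2), fully connected (assumption)
K = eta.T @ eta / N
np.fill_diagonal(J, 0.0)                         # no self-coupling (assumption)
np.fill_diagonal(K, 0.0)

sigma0 = xi[0].copy()                            # start in the first pattern
sigma1, prob, h, theta = heat_bath_step(sigma0, J, K, beta, rng)
m1 = (xi[0] * sigma1).sum() / (a * N)            # retrieval overlap after one step
print("overlap with pattern 0 after one step:", m1)
```

The probability of the inactive state returned in `prob[:, 1]` equals $1/Z_t$, so the normalization of Eq. (5) can be checked directly against the closed form of $Z_t$.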

In the sequel we do not indicate the explicit t-dependence. In distinction to the usual three-state model, where the coefficient of the quadratic part in $\epsilon_i$ is an externally adjustable threshold parameter, we have here a self-adjusting, state- and pattern-dependent function $\theta_i(\{\sigma_j\}, \{\eta_k\})$. In this sense, the present model belongs to a wider class of "self-control" networks, a case of which has been discussed before. We remark, however, that the threshold $\theta$ used in the latter is a macroscopic parameter; thus, no average had to be done over the microscopic random variables at each time step t.

Next, we consider the relevant quantities that describe the performance of the network. We need the conditional probability distribution of a neuron state given the distribution of the patterns, $p(\sigma_i | \xi_i^\mu)$. As a consequence of the mean-field character of the model it is enough to consider the distribution of a single typical neuron, so we can omit the index $i$. We can then derive

$$ p(\sigma | \xi^\mu) = (s_\xi^\mu + m^\mu \xi^\mu \sigma)\,\delta(\sigma^2 - 1) + (1 - s_\xi^\mu)\,\delta(\sigma), \tag{6} $$

where

$$ s_\xi^\mu = s^\mu + l^\mu (\xi^\mu)^2, \qquad s^\mu = \frac{q - a n^\mu}{1 - a}, \qquad l^\mu = \frac{n^\mu - q}{1 - a}. \tag{7} $$



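Here, $m^\mu = \langle \langle \sigma \rangle_{\sigma|\xi}\, \xi^\mu / a \rangle_\xi$ is the thermodynamic limit $N \to \infty$ of the retrieval overlap

$$ m_N^\mu = \frac{1}{aN} \sum_i \xi_i^\mu \sigma_i \tag{8} $$

between the state of the network and pattern $\{\xi_i^\mu\}$, and the internal average is over the conditional probability. The other parameters are the thermodynamic limits $q = \langle \langle \sigma^2 \rangle_{\sigma|\xi} \rangle_\xi$ and $n^\mu = \langle \langle \sigma^2 \rangle_{\sigma|\xi}\, (\xi^\mu)^2 / a \rangle_\xi$ of the neural activity

$$ q_N = \frac{1}{N} \sum_i \sigma_i^2 \tag{9} $$

and the activity overlap

$$ n_N^\mu = \frac{1}{aN} \sum_i \sigma_i^2 (\xi_i^\mu)^2, \tag{10} $$

respectively. Finally, $l^\mu = \langle \langle \sigma^2 \rangle_{\sigma|\xi}\, \eta^\mu \rangle_\xi$ is the thermodynamic limit of the fluctuation overlap between the binary state variables $\sigma_i^2$ and the fluctuations $\eta_i^\mu$, defined as

$$ l_N^\mu = \frac{1}{N} \sum_i \sigma_i^2 \eta_i^\mu. \tag{11} $$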



As is clear from its definition, the fluctuation overlap is 
connected with the activity overlap. 
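The connection can be made explicit: inserting $\eta_i^\mu = ((\xi_i^\mu)^2 - a)/[a(1-a)]$ into the definition of the fluctuation overlap gives $l_N^\mu = (n_N^\mu - q_N)/(1-a)$ exactly, for any $N$. The sketch below (function name, size and seed are illustrative assumptions) computes the four finite-$N$ order parameters and checks this identity on a hypothetical state whose active sites coincide with those of the pattern while its signs are random:

```python
import numpy as np

def overlaps(sigma, xi_mu, a):
    """Finite-N order parameters of Eqs. (8)-(11) for one pattern xi_mu."""
    N = sigma.size
    eta_mu = (xi_mu**2 - a) / (a * (1 - a))
    m = (xi_mu * sigma).sum() / (a * N)          # retrieval overlap
    q = (sigma**2).sum() / N                     # neural activity
    n = (sigma**2 * xi_mu**2).sum() / (a * N)    # activity overlap
    l = (sigma**2 * eta_mu).sum() / N            # fluctuation overlap
    return m, q, n, l

rng = np.random.default_rng(2)
N, a = 100_000, 0.8
xi = (rng.random(N) < a) * rng.choice([-1, 1], size=N)

# Hypothetical test state: active exactly where the pattern is active, signs random
sigma = np.abs(xi) * rng.choice([-1, 1], size=N)
m, q, n, l = overlaps(sigma, xi, a)
print(f"m={m:.4f} q={q:.4f} n={n:.4f} l={l:.4f}  (n-q)/(1-a)={(n - q) / (1 - a):.4f}")
```

For such a state the retrieval overlap is of order $1/\sqrt{N}$ while the fluctuation overlap is close to one, which is the kind of configuration the overlap $l$ is designed to detect.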

An underlying assumption that leads to the BEG model and that should be preserved in the implementation for any network architecture is that the dynamic activity $q \sim a$, as far as the order of magnitude is concerned. The necessity of such an activity control system has been emphasized before.

The fluctuation overlap $l^\mu$ can be viewed as the retrieval overlap between the binary patterns $\{\eta_i^\mu\}$ and states $\{\sigma_i^2\}$. A priori, this will be independent of the retrieval overlap $m^\mu$ between the three-state patterns and states. As will be seen below, a finite non-zero retrieval overlap induces a finite fluctuation overlap, and in this case the parameter $l^\mu$ should not add anything essentially new to the three-state network. It will turn out, however, that the converse is in general not true. Indeed, $l^\mu$ can be finite in a state of dynamic activity without necessarily a finite retrieval overlap $m^\mu$. Thus, there can be a phase in which $m^\mu = 0$ but where, nevertheless, there is a finite information carried by the fluctuation overlap $l^\mu \neq 0$. Whether the new phase is actually stable and interesting, and how long it takes for the network to reach the corresponding states, is the main issue that we address below.

Finally, we consider the mutual information between patterns and neurons, regarding the patterns as the inputs and the neuron states as the output of the network channel at each time step. This is given by, recalling that we do not indicate the explicit time dependence,

$$ I^\mu(\sigma, \xi) = S(\sigma) - \langle S(\sigma | \xi) \rangle_{\xi^\mu}, \tag{12} $$

where

$$ S(\sigma) = -q \ln(q/2) - (1 - q) \ln(1 - q) \tag{13} $$

is the entropy and $\langle S(\sigma | \xi) \rangle_{\xi^\mu} = a S_a + (1 - a) S_{1-a}$ is the equivocation term with

$$ S_a = -c_+ \ln c_+ - c_- \ln c_- - (1 - n^\mu) \ln(1 - n^\mu), \qquad S_{1-a} = -s^\mu \ln(s^\mu/2) - (1 - s^\mu) \ln(1 - s^\mu). \tag{14} $$

Here, $c_\pm = (n^\mu \pm m^\mu)/2$ and $s^\mu$ is the parameter in the conditional probability $p(\sigma | \xi^\mu)$. The mutual information can then be used to obtain the information $i^\mu = \alpha I^\mu$, where $\alpha = p/N$ is the storage ratio of the network.
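A small sketch of Eqs. (12)-(14) as a function of the order parameters may help to see what the mutual information measures. The helper names are hypothetical and the convention $0 \ln 0 = 0$ is assumed; the three test cases are idealized limits, not stationary states computed from the dynamics.

```python
import numpy as np

def neg_p_log(p, scale=1.0):
    """Return -p * ln(p / scale), with the convention 0 ln 0 = 0."""
    p = float(p)
    return 0.0 if p <= 0.0 else -p * np.log(p / scale)

def mutual_information(m, n, s, a):
    """I = S(sigma) - <S(sigma|xi)>, Eqs. (12)-(14), for a single condensed pattern."""
    q = a * n + (1 - a) * s                                   # dynamic activity
    S_sigma = neg_p_log(q, 2.0) + neg_p_log(1 - q)            # Eq. (13)
    c_plus, c_minus = (n + m) / 2, (n - m) / 2
    S_a = neg_p_log(c_plus) + neg_p_log(c_minus) + neg_p_log(1 - n)
    S_1a = neg_p_log(s, 2.0) + neg_p_log(1 - s)
    return S_sigma - (a * S_a + (1 - a) * S_1a)

a = 0.8
I_perfect = mutual_information(m=1.0, n=1.0, s=0.0, a=a)   # state equals the pattern
I_none = mutual_information(m=0.0, n=a, s=a, a=a)          # activity uncorrelated with the pattern
I_fluct = mutual_information(m=0.0, n=1.0, s=0.0, a=a)     # active sites match, signs random
print(I_perfect, I_none, I_fluct)
```

In the third case the retrieval overlap vanishes but the result equals the binary entropy $-a \ln a - (1-a) \ln(1-a)$ of the activity pattern, which illustrates how a state with $m = 0$ and finite fluctuation overlap can still carry information.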



III. MACRODYNAMICS

The local fields in terms of the retrieval and the fluctuation overlaps

$$ h_{i,t} = \frac{1}{a} \sum_\mu \xi_i^\mu m_{N,t}^\mu, \qquad \theta_{i,t} = \sum_\mu \eta_i^\mu l_{N,t}^\mu \tag{15} $$

are the suitable starting point for the macrodynamics. The actual overlaps

$$ m_{N,t+1}^\mu = \frac{1}{aN} \sum_i \xi_i^\mu \langle \sigma_{i,t+1} \rangle, \qquad l_{N,t+1}^\mu = \frac{1}{N} \sum_i \eta_i^\mu \langle \sigma_{i,t+1}^2 \rangle \tag{16} $$

depend on the thermal averages of $\sigma_{i,t+1}$ and $\sigma_{i,t+1}^2$, given by $F_\beta(h, \theta) = 2 e^{\beta\theta} \sinh(\beta h)/Z$ and $G_\beta(h, \theta) = 2 e^{\beta\theta} \cosh(\beta h)/Z$, respectively. In the zero temperature limit

$$ F_\infty = \mathrm{sgn}(h)\,\Theta(|h| + \theta), \qquad G_\infty = \Theta(|h| + \theta). \tag{17} $$

We assume that a single pattern $\xi^\mu$ and fluctuation $\eta^\mu$ are condensed, that is, $m^\mu$ and $l^\mu$ are of order $O(1)$ for a given $\mu = \nu$ (and for $\mu \neq \nu$ they are of order $O(1/\sqrt{N})$), and we call these $m$ and $l$, respectively. These yield a finite signal term for each local field and related Gaussian noise terms. In accordance with this, we also assume that $n = n^\nu = O(1)$ and $s = s^\nu = O(1)$ and we denote $i = i^\nu$.

The asymptotic macrodynamics for the extremely diluted network then follows the single-step evolution equations for the order parameters, exact in the large-N limit, for each time step t,

$$ m_{t+1} = \int Dy \int Dz\, F_\beta\!\left( \frac{m_t}{a} + y \Delta_t;\ \frac{l_t}{a} + z \frac{\Delta_t}{1 - a} \right), \tag{18} $$

$$ n_{t+1} = \int Dy \int Dz\, G_\beta\!\left( \frac{m_t}{a} + y \Delta_t;\ \frac{l_t}{a} + z \frac{\Delta_t}{1 - a} \right), \tag{19} $$

$$ s_{t+1} = \int Dy \int Dz\, G_\beta\!\left( y \Delta_t;\ -\frac{l_t}{1 - a} + z \frac{\Delta_t}{1 - a} \right), \tag{20} $$

together with the dynamic activity $q_t = a n_t + (1 - a) s_t$. The equation for $l_t$ is obtained using the relation $l_t = (n_t - q_t)/(1 - a)$. Here, as usual, $Dx = \exp(-x^2/2)\,dx/\sqrt{2\pi}$, whereas $\Delta_t^2 = \alpha q_t / a^2$ and $\Delta_t^2/(1 - a)^2$ are the variances in the Gaussian local fields $h_i$ and $\theta_i$, respectively. With these equations we also get the time evolution of the information $i$ by means of Eqs. (12) to (14).

The equations for the dynamics of the macroscopic order parameters can now be used both to study the time evolution of the network and to determine the properties of the stable stationary states.



IV. RESULTS


We recognize, essentially, three phases given by stable stationary states of the network dynamics, Eqs. (18)-(20), as shown in Fig. 1 for a typical activity of $a = 0.8$. There is a retrieval phase R ($m \neq 0$, $l \neq 0$), a fluctuation phase Q ($m = 0$, $l \neq 0$) and a self-sustained activity phase S ($m = 0$, $l = 0$), referred to as the zero phase, Z, in previous works, all for $q \sim a$. The stationary states in these phases are indicated as attractors (a) in the side table of the phase diagram. There are also saddle-point solutions (s) either with $m = 0$, $l \neq 0$, $q \sim a$ or with $m = 0$, $l = 0$, $q \sim a$, denoted also by Q and S, respectively. Both the saddle points in Q and S have attractor directions along $l$, towards $l^* \neq 0$ and $l^* = 0$, respectively, and repeller directions along $m$ away from $m = 0$. Hence, they have strictly one-dimensional basins of attraction in the two-dimensional order parameter subspace of $m$ and $l$, and then only if the precise initial condition $m_0 = 0$ is met.
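The stationary states and their stability can be explored by iterating the single-step map numerically. The sketch below is one possible way to do this, not the procedure used to produce the figures: it assumes the map $m_{t+1} = \int Dy\,Dz\, F_\beta(m_t/a + y\Delta_t;\ l_t/a + z\Delta_t/(1-a))$, the analogous expressions with $G_\beta$ for $n_{t+1}$ and $s_{t+1}$, $\Delta_t^2 = \alpha q_t/a^2$ and $\beta = a/T$, and it evaluates the Gaussian integrals with a 60-point Gauss-Hermite rule (an arbitrary choice that becomes crude at very low $T$, where the integrands approach step functions).

```python
import numpy as np

# Gauss-Hermite nodes and weights for the measure Dx = exp(-x^2/2) dx / sqrt(2 pi)
_x, _w = np.polynomial.hermite_e.hermegauss(60)
_w = _w / np.sqrt(2.0 * np.pi)
Y, Z = np.meshgrid(_x, _x, indexing="ij")
W = np.outer(_w, _w)

def G(h, theta, beta):
    """<sigma^2> = 2 e^{beta theta} cosh(beta h) / Z, evaluated without overflow."""
    bh = np.abs(beta * h)
    log_A = beta * theta + bh + np.log1p(np.exp(-2.0 * bh))  # ln[2 e^{beta theta} cosh(beta h)]
    return 0.5 * (1.0 + np.tanh(0.5 * log_A))                # equals A / (1 + A)

def F(h, theta, beta):
    """<sigma> = 2 e^{beta theta} sinh(beta h) / Z = tanh(beta h) * G."""
    return np.tanh(beta * h) * G(h, theta, beta)

def step(m, l, q, a, alpha, T):
    """One time step of the order parameters (m, l, q)."""
    beta = a / T
    delta = np.sqrt(alpha * q) / a
    h1, th1 = m / a + Y * delta, l / a + Z * delta / (1 - a)   # sites with xi^2 = 1
    h0, th0 = Y * delta, -l / (1 - a) + Z * delta / (1 - a)    # sites with xi = 0
    m_new = np.sum(W * F(h1, th1, beta))
    n_new = np.sum(W * G(h1, th1, beta))
    s_new = np.sum(W * G(h0, th0, beta))
    q_new = a * n_new + (1 - a) * s_new
    l_new = (n_new - q_new) / (1 - a)
    return m_new, l_new, q_new

def iterate(m0, l0, q0, a, alpha, T, steps):
    """Trajectory of (m, l, q) as an array of shape (steps + 1, 3)."""
    traj = [(m0, l0, q0)]
    for _ in range(steps):
        traj.append(step(*traj[-1], a, alpha, T))
    return np.array(traj)

# Illustrative run: tiny initial overlaps, a = 0.8, T = 0.6, alpha = 0.1
traj = iterate(1e-5, 1e-5, 0.8, a=0.8, alpha=0.1, T=0.6, steps=200)
print("final (m, l, q):", traj[-1])
```

Because $F_\beta$ is odd in its first argument, a start with $m_0 = 0$ keeps $m_t = 0$ at all times, which is the invariant line on which the Q and S saddle points attract; any small $m_0 \neq 0$ leaves that line, and plotting the columns of `traj` against $t$ shows how long the trajectory lingers near it.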

There is a retrieval phase in regions I to V. It is the only stable phase in regions I to III and it coexists with the self-sustained activity phase in regions IV and V. The latter implies that in these two regions the basin of attraction of the retrieval states is always limited by the attracting self-sustained activity states. On the other hand, the fluctuation phase exists only in regions VI and VII, coexisting in the latter with the self-sustained activity phase which, in turn, exists only in that region and in region VIII. Dotted lines in the phase diagram denote continuous phase boundaries while full lines indicate discontinuous transitions. The phase boundaries denoted by thick lines mark the boundary of the retrieval phase; the ones further to the right yield the critical storage capacity $\alpha_c$, where both overlaps disappear. Extensive calculations of the $\alpha$ dependence of the order parameters were performed to obtain the phase diagram, with particular emphasis on the search for possible stable Q states. Clearly, for a given activity $a$, these states appear only above a certain threshold in T.

A similar behavior appears for other large values of $a$ above a minimum, whereas a different behavior sets in for smaller $a$, as shown in Fig. 2 for $T = T(a)$ and $\alpha = 0$. Indeed, when $a$ is less than 1/2 there is a continuous phase boundary at $T = 2/3$ between the retrieval phase at low T and the self-sustained activity phase at high T. At $a = 1/2$ the transition becomes discontinuous and, up to $a = 0.698$, the only phases present are still R and S. The transition remains discontinuous for small $\alpha$ and it becomes continuous for bigger values of $\alpha$. For $\alpha = 0$, the Q phase starts to appear with increasing $a$ at a triple point with $a = 0.698$ and $T = 0.767$. Beyond $a = 0.698$ the transition between the R and the Q phase remains discontinuous up to the tricritical point with $a = 0.711$ and $T = 0.785$. For bigger values of $a$, the transition between the R and the Q phase remains continuous and ends at $T = 1/2$ for $a = 1$. It also turns out that there is a critical $\alpha = 1/\pi \approx 0.318$, for $a = 1$.

We discuss next the typical $\alpha$ dependence of the order parameters that yield the phase diagram of Fig. 1 and we also show the information content of the network below and above the threshold where the Q phase starts to appear. For $a = 0.8$, where the threshold is given by $T \approx 0.45$ for $\alpha = 0.221$, we obtain the results shown in Fig. 3. Clearly, for T below that threshold (left figure) the two overlaps $m$ and $l$ remain finite together, in a behavior characteristic of retrieval, up to the critical $\alpha_c$. Thus, in this regime the fluctuation overlap does not yield anything essentially new that is not contained in the retrieval overlap. In contrast, above the threshold (right figure) the retrieval overlap disappears first with increasing $\alpha$, leaving a finite $l$ that describes a fluctuation overlap up to a bigger critical value $\alpha_c$. Hence, it is necessary that first T and then $\alpha$ become large enough for the Q states to become stable. It is also worth noting that the fluctuation overlap carries a finite information even with zero pattern retrieval overlap. Thus, although the information transmitted by the network is mainly in the retrieval phase, there is also some information due to the Q phase. This information is provided by the fact that the active neurons coincide with the active patterns but the signs are not correlated. One might imagine an example in pattern recognition where, looking at a black and white picture on a grey background, this phase would tell us the exact location of the picture with respect to the background without finding the details of the picture itself.

We also show in Fig. 3 the comparison of the performance with two other three-state networks. One is the usual network with an externally adjustable optimal threshold parameter that appears in the quadratic part of the single-site energy function, formally the same as Eq. (4) but with a uniform $\theta_i = \theta$. The parameter is restricted to be positive and chosen to yield the largest mutual information. Allowing the threshold to become negative would essentially mimic a binary model and this is not the subject of the present work. The second network is a phenomenological extension to finite T of a recent three-state self-control model (SCT) in which the self-control threshold $\theta_t$ at $T = 0$ is replaced by a linearly shifted threshold $\hat{\theta}_t = \theta_t - T$, where $\theta_t = \sqrt{-2 \ln a}\, D_t$ with $D_t^2 = \alpha q_t / a$ being the variance of the noise. The results of Fig. 3 clearly show that the BEG network has a better performance for high T, at least as far as the information content is concerned, than the optimal threshold network, and a worse performance for lower T.

To understand the typical behavior of the dynamics of the network we show in Fig. 4 the time evolution of the order parameters and the information content, for $a = 0.8$ and $T = 0.6$, in both the BEG and the SCT network. In support of the phase diagram shown in Fig. 1, it can be seen that in the case of the BEG network with increasing $\alpha$, one has first the asymptotic states of an R phase, then the states of a Q phase and, finally, the states of the S phase. In all cases where one would expect a Q state, we start with the most favorable initial overlaps for that state, that is, $m_0 = 0$ and a small but finite $l_0$. In contrast, for the SCT network and the indicated values of $\alpha$ one has only an R phase.

A closer examination of the curves for the BEG network reveals that the fluctuation overlap may "drive" a vanishingly small initial retrieval overlap, meaning almost no recognition of a given pattern by the state of the network, into an asymptotic state with finite recognition. This is in contrast with the expectation for other three-state networks, as in the case of the SCT network, where first the overlap $m_t$ becomes non-zero. It is also worth noting that, with a very small initial $m_0$, the states of the network are expected to pass through the vicinity of a saddle point, with a finite fluctuation overlap $l$ and still a vanishing retrieval overlap at small or intermediate times. This situation is described by the first plateaus in $q$, $l$ and $i$. It is only in passing beyond those plateaus, which may take a rather long time, that the states attain the asymptotic behavior of the retrieval phase. It also turned out that with the initial conditions used for the BEG network, in the left part of the figure, there is no retrieval in the SCT network, meaning that the basins of attraction for retrieval are larger in the BEG network.

Finally, the results for the stationary states are confirmed by a set of flow diagrams, a particular one of which is shown in Fig. 5 for $a = 0.8$ and $T = 0.6$, first below the R-Q phase boundary, where the retrieval state R is stable, and then above, where the Q phase is stable. Clearly, in the first case, for a small initial retrieval overlap and a finite fluctuation overlap, the states evolve first in the attractor direction of the saddle point and only then do they start to flow towards the true (retrieval) attractor. Similar flow diagrams were obtained for other sets of parameters and in all cases we found that the attractors have fairly large basins of attraction.

V. CONCLUSIONS

We discussed the dynamic evolution and the stationary states of the recently introduced BEG neural network model for an extremely diluted architecture. We placed particular emphasis on the stability of the stationary states, which had not been explored before, and found that the new phase Q (previously called the "dipolar" or "quadrupolar" phase), characterized by a zero retrieval overlap and a finite fluctuation overlap, is a true stable phase only for moderate to large pattern activity $a$. We found that an activity-dependent synaptic noise has a relevant role in deciding whether the new phase can be reached or not. In particular, that phase is not a stable one at $T = 0$ for any activity smaller than one. It is also not stable, in general, for somewhat higher values of T.

We also found that the dynamics may be slowed down due to the presence of saddle-point solutions in the equations that appear in large regions of the phase diagram, in particular in the retrieval phase and close to the critical phase boundary. Although the specific results obtained here are for the extremely diluted network, some of the features found may also appear in other architectures, for instance, in a layered network, and there is work in progress for that case. It would be interesting to study the time evolution of the BEG network also in other non-trivial dynamics.

FIG. 1. The $(T, \alpha)$ phase diagram for the extremely diluted BEG network with pattern activity $a = 0.8$. Full (dotted) lines denote discontinuous (continuous) transitions. The heavy lines denote the boundary of the retrieval phase R and the other lines the boundaries of the fluctuation-overlap phase Q and the self-sustained activity phase S. The solutions are either attractors (a) or saddle points (s).

FIG. 2. The $(T, a)$ phase diagram for the extremely diluted BEG network with the load $\alpha = 0$. Full (dotted) lines denote discontinuous (continuous) transitions.

FIG. 3. The order parameters $m$, $l$ and $q$, and the information content $i$, in the stationary state, for initial overlaps $m_0 = 1$, $l_0 = 1$ and $q_0 = a$, as functions of the load $\alpha$, for the BEG network with pattern activity $a = 0.8$ and two noise levels: $T = 0.2$ (left) and $T = 0.6$ (right). The self-controlled threshold network (SCT), in dashed lines, and the optimal threshold network (dots), are also shown.

FIG. 4. Time evolution of the order parameters $m$, $l$ and $q$ and information content $i$, for pattern activity $a = 0.8$, temperature $T = 0.6$ and load $\alpha$, as indicated. The BEG network (left), for the initial overlaps $m_0 = l_0 = 10^{-5}$, except for $\alpha = 0.2$ ($m_0 = l_0 = 1$); the SCT network (right), for the initial overlaps $m_0 = 10^{-3}$ and $l_0 = 1$; both with $q_0 = a$.

FIG. 5. The flow diagram $(l, m)$ for the BEG network with $a = 0.8$ and $T = 0.6$. For $\alpha = 0.1$ (left) the stable attractor is R, indicated by the circle, while Q, indicated by the cross, is a saddle point. For $\alpha = 0.15$ (right) the stable attractor is Q.